Combining Bias and Variance Reduction Techniques for Regression Trees
Authors
Abstract
Gradient Boosting and bagging applied to regressors can reduce the error due to bias and variance, respectively. Alternatively, Stochastic Gradient Boosting (SGB) and Iterated Bagging (IB) attempt to simultaneously reduce the contribution of both bias and variance to error. We provide an extensive empirical analysis of these methods, along with two alternative bias-variance reduction approaches: bagging Gradient Boosting (BagGB) and bagging Stochastic Gradient Boosting (BagSGB). Experimental results demonstrate that SGB does not perform as well as IB or the alternative approaches. Furthermore, results show that, while BagGB and BagSGB perform competitively for low-bias learners, Iterated Bagging is in general the most effective of these methods.
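As a rough illustration of the two alternative approaches, the sketch below wraps a bagging ensemble around (stochastic) gradient boosting in Python with scikit-learn. The dataset, parameter values, and the use of scikit-learn itself are illustrative assumptions, not the configuration used in the experiments reported here.

    # Minimal sketch of BagGB and BagSGB (assumed setup, not the authors' code).
    # Requires scikit-learn >= 1.2; older releases name the first argument of
    # BaggingRegressor "base_estimator" instead of "estimator".
    from sklearn.datasets import make_friedman1
    from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # BagGB: boosting reduces bias inside each ensemble member; bagging the
    # members on bootstrap samples reduces variance across them.
    bag_gb = BaggingRegressor(
        estimator=GradientBoostingRegressor(n_estimators=100),
        n_estimators=10,
        random_state=0,
    )

    # BagSGB: identical, except each inner booster is Stochastic Gradient
    # Boosting (subsample < 1.0 fits every stage on a random subsample).
    bag_sgb = BaggingRegressor(
        estimator=GradientBoostingRegressor(n_estimators=100, subsample=0.5),
        n_estimators=10,
        random_state=0,
    )

    for name, model in [("BagGB", bag_gb), ("BagSGB", bag_sgb)]:
        model.fit(X_tr, y_tr)
        print(name, "test MSE:", mean_squared_error(y_te, model.predict(X_te)))

Iterated Bagging has no comparable off-the-shelf implementation; in Breiman's formulation, each bagging stage is trained on the out-of-bag residuals left by the previous stage.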
Similar Papers
Bias, Variance, and Arcing Classifiers
Recent work has shown that combining multiple versions of unstable classifiers such as trees or neural nets results in reduced test set error. To study this, the concepts of bias and variance of a classifier are defined. Unstable classifiers can have universally low bias. Their problem is high variance. Combining multiple versions is a variance reducing device. One of the most effective is bagg...
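To make the variance-reduction point concrete, here is a minimal sketch comparing a single unstable tree against a bagged ensemble of trees; the synthetic data and all settings are assumptions for illustration, not the paper's experiments.

    # Bagging unstable trees: the ensemble usually has lower test error
    # because averaging over bootstrap replicates reduces variance.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_informative=10, random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

    tree = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)
    bag = BaggingClassifier(
        estimator=DecisionTreeClassifier(), n_estimators=50, random_state=1
    ).fit(X_tr, y_tr)

    print("single tree accuracy:", accuracy_score(y_te, tree.predict(X_te)))
    print("bagged trees accuracy:", accuracy_score(y_te, bag.predict(X_te)))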
Investigation and Reduction of Discretization Variance in Decision Tree Induction
This paper focuses on the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Different discretization procedures are first studied empirically, then means to reduce the discretization variance are proposed. The experiment shows that discretization variance is large and that it is possible to reduce it significantly without notable...
Combining Global and Local Grid-Based Bias Correction for Mesoscale Numerical Weather Prediction Models
Two methods for objective grid-based bias removal in mesoscale numerical weather prediction models are proposed, one global and one local. The global method is an elaboration of model output statistics (MOS), combining several modern methods for multiple regression: alternating conditional expectation (ACE), regression trees, and Bayesian model selection. This allows the representation of nonli...
Improved customer choice predictions using ensemble methods
In this paper, various ensemble learning methods from machine learning and statistics are considered and applied to the customer choice modeling problem. Applying ensemble learning usually improves the prediction quality of flexible models such as decision trees. We give experimental results for two real-life marketing datasets using decision trees, ...
Inducing Polynomial Equations for Regression
Regression methods aim at inducing models of numeric data. While most state-of-the-art machine learning methods for regression focus on inducing piecewise regression models (regression and model trees), we investigate the predictive performance of regression models based on polynomial equations. We present Ciper, an efficient method for inducing polynomial equations and empirically evaluate its...
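Ciper itself is not distributed with mainstream Python libraries; as a loose illustration of fitting a polynomial equation to numeric data (a substitute for, not a reimplementation of, Ciper's equation search), one can expand the attributes into polynomial terms and fit a linear model:

    # Fit a degree-2 polynomial equation over all input attributes
    # (illustrative substitute for Ciper's refinement search).
    from sklearn.datasets import make_friedman1
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    X, y = make_friedman1(n_samples=500, random_state=2)
    model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
    model.fit(X, y)
    print("training R^2:", model.score(X, y))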